This publication is primarily a working paper. It is published solely to document work performed


United States Air Force (USAF) pilot candidates were administered a computerized test
battery, the Basic Attributes Tests (BAT), that is currently being validated for use in pilot
selection and classification. Included in the battery were five tests measuring personality and
attitudinal characteristics. These tests were evaluated singly and in combination in terms of
their ability to enhance the prediction of pilot training outcomes, relative to the prediction
offered by the paper-and-pencil measures being used operationally. Based on results from the
present data, it was recommended that four of the five tests under review be eliminated from the
BAT and that other measures of personality and attitudinal characteristics be evaluated for
possible inclusion in a subsequent version of the BAT battery.


PREFACE 


This work was completed under Work Unit 77191845 in support of a Request for
Personnel Research (RPR 78-11, Selection for Pilot Training) submitted by Air Force
training program managers. This paper is intended to serve as interim documentation
regarding the personality/attitudinal tests of the Basic Attributes Tests (BAT) battery.


PERSONALITY, ATTITUDES, AND PILOT TRAINING
PERFORMANCE: FINAL ANALYSIS


I.  INTRODUCTION 

Most research into military pilot selection and classification has concentrated on
psychomotor skills and perceptual/cognitive abilities (e.g., Imhoff & Levine, 1981).
Relationships among pilot personality, attitudes, and performance have been researched less,
although interest in the topic dates back to World War I (North & Griffin, 1977). The present
technical paper focuses on recent efforts to validate a number of personality and attitude
measures included in a computerized battery of tests currently being evaluated by the United
States Air Force (USAF), the Basic Attributes Tests (BAT) battery.

Although several studies in the past 50 years have explored relationships between pilot
characteristics and performance, there has been little progress in using measures of individual
differences to predict aviator training and performance criteria (Griffin & Mosko, 1977). Two
associated factors may account for the weak relationship between personality tests and outcome
criteria. One is that the tests generally have focused on distinguishing between normal and
abnormal individuals. The second factor is that such tests have been prone to response bias;
that is, subjects guess what the test is designed to measure and fake their responses accordingly.

Recent developments in personality testing have addressed both of these issues. One
development has been the design of tests in which the dimension being measured is not immediately
apparent. A number of these measures have been used in the Air Force (Mullins, 1960, 1962), such
as Dot Estimation and Self-Crediting Word Knowledge.

Another development in personality testing has been the design of tests in which the response
alternatives to items are equivalent in terms of social desirability, minimizing the tendency of
subjects to fake their responses (North & Griffin, 1977). The Activities Interest Inventory, for
example, requires the subject to choose between two activities that differ only in the degree of
riskiness associated with those activities.

A third development is the increasing use of personality tests to select for positive
attributes, as opposed to screening for possible pathological attributes. Helmreich and his
colleagues, for example, have found that among both airline and general aviation pilots the
characteristics of self-assertiveness, interpersonal orientation, and achievement motivation are
each associated with attitudes and performance (Helmreich, 1982; Siem, 1987; Siem & Helmreich,
1986).

The five tests described below were selected for inclusion in the BAT battery to measure
domains identified as having potential for pilot selection and classification (Imhoff & Levine,
1981). In particular, the tests focus on the measurement of decision-making style, risk-taking
attitudes, self-confidence, and field dependence/independence (see Table 1). These measures were
chosen based on the observation that a pilot, particularly when flying a jet fighter, must
accurately analyze situations that involve a high degree of risk and then respond decisively yet
without acting impulsively (Imhoff & Levine, 1981).

As their use was intended to improve present USAF pilot selection practices, these
personality and attitude measures were assessed here in terms of their ability to explain unique
variance in the various pilot training performance criteria; that is, criterion variance over
and above that explained by the currently used selection instruments (subtest scores of the Air
Force Officer Qualifying Test [AFOQT]). Because the AFOQT subtests are cognitive/perceptual in
nature, it was expected that they would not be correlated highly with the personality/attitudinal
measures from the BAT.




II. METHOD


Subjects 


The subjects in this study were 1,992 USAF officer candidates tested on the Basic Attributes
Tests (BAT) battery. As not all BAT-tested subjects were accepted into Undergraduate Pilot
Training (UPT) or completed the training, the sample sizes for the various prediction and
criterion measures vary (see Table 2). For a definition of criterion measures, see below.


Table 2. Numbers of Subjects Available

Prediction/criterion measures               N
AFOQT and BAT Personality Tests         1,992
UPT Outcome (pass/fail)                   812
ATRB Rating (TTB/FAR)                     534

Instrumentation 


The AFOQT is a paper-and-pencil test battery consisting of 16 subtests. Scores from the
subtests are combined into five composite measures: Verbal, Quantitative, Academic Aptitude
(Verbal and Quantitative combined), Pilot, and Navigator-Technical. See Table 3 for the subtests
that make up each AFOQT composite.


Table 3. Composition of AFOQT Form O Aptitude Composites

                                                  Academic         Navigator-
Subtest                     Verbal  Quantitative  aptitude  Pilot  technical
Verbal Analogies              X                      X        X
Arithmetic Reasoning                     X           X                 X
Reading Comprehension         X                      X
Data Interpretation                      X           X                 X
Word Knowledge                X                      X
Math Knowledge                           X           X                 X
Mechanical Comprehension                                      X        X
Electrical Maze                                               X        X
Scale Reading                                                 X        X
Instrument Comprehension                                      X
Block Counting                                                X        X
Table Reading                                                 X        X
Aviation Information                                          X
Rotated Blocks                                                         X
General Science                                                        X
Hidden Figures                                                         X

In the analyses described below, raw scores for the 16 subtests are used rather than
composite scores. This was done for two reasons: first, to identify the content areas of the
AFOQT that are related most closely to flight training performance; second, to determine whether
the BAT personality/attitude tests are able to explain unique variance in flight training
performance not accounted for by the 16 AFOQT subtests.


Dot Estimation


The psychological factor assessed by this test is compulsiveness/decisiveness. Two boxes
containing an arbitrary number of dots are presented on the screen. One of the two boxes has one
more dot than the other. The subject's task is to determine, as quickly as possible, which of
the two boxes contains the greater number of dots. The subject is not told to count the dots in
each box, but told only to decide as quickly and accurately as possible which has the greater
number.

In the present effort, reaction time and accuracy of response were recorded on each trial.
This was the only test in the battery that had a fixed time limit (5 minutes, maximum of 55
trials).


Risk-Taking 

This test assesses risk-taking tendency in making decisions. Ten boxes are presented in two
rows of five boxes each. The subject is told that nine of the ten boxes contain a reward, whereas
one of the boxes is a "disaster" box. The subject is allowed to select the boxes one at a time.
If the selected boxes contain a payoff, the subject is allowed to keep it; but if the subject
chooses the disaster box, all of the payoff earned on that trial is lost. The average number of
boxes selected provides an index of the subject's tendency for taking risks when making decisions.

Response time per choice and number of boxes chosen were recorded on each of the 30 trials.
Unknown to the subject, during 12 of the 30 trials there was no disaster box (i.e., no risk).
This was done to get a clean measure of risk-taking behavior, as performance on the disaster box
trials might have been affected by chance.


Self-Crediting  Word  Knowledge 

Self-assessment ability and self-confidence are the psychological attributes measured by this
test. This is essentially a vocabulary test in which the subject is presented with a "target" word
and five other words from which its closest synonym has to be chosen. There are three blocks of
ten questions each. The target words become increasingly difficult with each successive block.
The subject is informed of this increasing difficulty and is required to make a bet prior to each
block which reflects how well he/she expects to perform. Response time and accuracy of response
were recorded on each of the 30 trials.


Activities  Interest  Inventory 

The psychological factors underlying this test are survival attitudes and risk-taking
tendency. This test is designed to determine the subject's interest in various activities. The
subject is presented with 81 pairs of activities and is asked to indicate a preference for each
pair. The subject is told to assume that he/she has the necessary ability to perform each
activity. The activity pairs force the subject to choose between tasks that differ on threat to
physical survival, sometimes subtly and sometimes not. Here, the measures of interest were the
number of high-risk options chosen and the average amount of time required to choose between
pairs of activities.


Embedded Figures


This test is designed to assess the psychological factor of field dependence/independence.
It should be noted that level of field dependence has been treated as a personality
characteristic by some researchers and as a perceptual ability by others.

As this test has been examined separately in another paper (Carretta, 1987), it will not be
examined in detail here. However, analyses were performed to determine its relationship to the
other BAT tests discussed in this paper.

In this test, the subject is presented with a simple geometric figure and two complex
figures. The task is to decide which of the two complex figures has the simple figure within it
and to indicate a choice by pressing the keypad button corresponding to the figure. Speed and
accuracy of response were recorded on each of the 30 trials.


UPT  Performance  Criteria 


UPT final training outcome was scored as a dichotomous variable with Pass = 1 and Fail = 0.
Subjects who passed UPT received a recommendation from an Advanced Training Recommendation Board
(ATRB) for advanced training leading to an assignment either as a Tanker-Transport-Bomber (TTB)
pilot or a Fighter-Attack-Reconnaissance (FAR) pilot (FAR = 1 and TTB = 0).


Apparatus 

The BAT apparatus consists of a super-microcomputer built into a self-contained unit with a
glare shield and side panels designed to ensure consistency of testing sessions. The subject
responds to the various tests using, in combination or individually, a two-axis joystick on the
right side of the apparatus, a single-axis joystick on the left side, and a keypad in the center
of the test unit. The keypad includes the numbers 0 to 9, an "ENABLE" key in the center, and a
bottom row with "YES" and "NO" keys and two others labeled "S/L" (for same/left responses) and
"D/R" (for different/right responses). Figure 1 is a picture of the test apparatus. During a
test session, the test administrator's keyboard is stored under the desk of the test apparatus.

The test battery as used in this study consisted of 15 tests lasting about 3 1/2 hours.
After a test administrator initiated the system, the test session was self-paced by the subject.
The test session included programmed breaks between tests to avoid problems with mental and
physical fatigue.


Procedure 


Prior to entry into UPT, each subject was administered both the AFOQT and the BAT. Pilot
candidates were commissioned through either the Air Force Reserve Officer Training Corps (AFROTC)
or the Air Force Officer Training School (OTS). Candidates commissioned through AFROTC took the
AFOQT prior to entering college or while an undergraduate. For AFROTC candidates, the BAT was
administered during the summer of their junior year in college. For the OTS candidates, the
AFOQT was administered after their attainment of a college degree and the BAT was administered at
the beginning of their participation in a 2-week Flight Screening Program (FSP).

All candidates took part in the UPT program, which lasts 49 weeks. The ATRB decision was
made at the 42nd week of UPT, with final outcome (pass/fail) assigned at the end of the program.


III. RESULTS


AFOQT  Scores 


A model that used the raw scores from the 16 AFOQT subtests was related significantly to both
UPT performance measures. For predicting UPT final outcome (graduation/elimination; R = .285,
p < .0001), the subtests that contributed most strongly were Instrument Comprehension (r = .218,
p < .0001) and Aviation Information (r = .173, p < .01). Scores on the Rotated Blocks (r = .102,
p < .10) and Arithmetic Reasoning (r = .053, p < .10) subtests contributed marginally to
prediction of UPT final outcome. Although other subtests had larger zero-order correlations with
UPT final outcome than did the Arithmetic Reasoning subtest, they were given less weight in the
simultaneous regression model. This suggests that although their zero-order correlations were
larger, they were not contributing to the prediction of unique variance in the criterion variable
(UPT final outcome). Similar results were obtained for the advanced training recommendation.

For the ATRB recommendation (fighter/non-fighter assignment; R = .273, p < .001), the subtests
that contributed significantly were Instrument Comprehension (r = .155, p < .05), Block Counting
(r = -.008, p < .05), and Table Reading (r = .117, p < .05). Arithmetic Reasoning (r = .129,
p < .10) and Word Knowledge (r = -.033, p < .10) scores contributed marginally to prediction of
advanced training recommendation.

These results suggest that the relative importance of the ability domains measured by the 16
AFOQT subtests may change during the course of training. Procedural knowledge about flying
(e.g., Aviation Information) acquired before entering UPT may be most important during the early
stages of training. Individual differences in procedural knowledge probably decrease during
training as level of flying experience increases. During the later stages of training (when the
advanced training recommendation is made), individual differences in information processing
ability become more important (e.g., Arithmetic Reasoning, Instrument Comprehension, Table
Reading). These regression analyses are summarized in Table 4.


Dot  Estimation 


Descriptive Measures

This test provided several measures to evaluate compulsiveness/decisiveness, including the
number of trials completed, number of correct responses, total amount of time spent performing
the test, average response time for correct responses, and percent correct.

As can be seen in Table 5, the average number of trials completed was 49.6 out of a maximum
of 55. As previously discussed, this test was designed as a "speeded" test; thus, few subjects
should have completed all items. On speeded tests, performance is determined, in part, by the
number of trials completed. A performance "ceiling" may have occurred with this test, as too many
subjects completed all items (65%). This could be avoided in the future by either increasing the
number of trials or reducing the time limit to a point where few subjects complete all items.

Average number correct (31.7) and percent correct (65.6%) were acceptable, as subjects were
not explicitly instructed to count the number of dots in each box before making a choice.


Table 4. AFOQT Subtest Scores: Summary of UPT Outcome Regression Analyses

                                 Correlation with AFOQT measure
                                 UPT outcome      ATRB outcome
                                 M = 0.66         M = 0.57
AFOQT measure                    (N = 812)        (N = 514)

Subtest
  Verbal Analogies               -.044             .040
  Arithmetic Reasoning            .053             .129
  Reading Comprehension          -.059             .065
  Data Interpretation             .031             .114
  Word Knowledge                 -.088            -.033
  Math Knowledge                 -.026             .039
  Mechanical Comprehension        .024             .098
  Electrical Maze                 .011             .041
  Scale Reading                   .031             .095
  Instrument Comprehension        .218****         .155*
  Block Counting                  .075            -.008*
  Table Reading                   .057             .117*
  Aviation Information            .173**           .121
  Rotated Blocks                  .102             .048
  General Science                 .002             .022
  Hidden Figures                  .027             .046

All 16 Subtests (multiple R)      .285****         .273***

Note. Significance levels (*) refer to the unique contribution of a
variable in the context of a reduced set of variables which themselves
contribute uniquely to the prediction of a criterion. Critical values for
zero-order correlations at the .05 level of significance are .069 for N = 800
(UPT) and .088 for N = 500 (ATRB).

*p < .05.
**p < .01.
***p < .001.
****p < .0001.


Table 5. Dot Estimation: Means and Standard Deviations

Variable                                                  Mean            SD
Number of Trials Completed                                49.6          11.8
Number of Correct Responses                               31.7           6.9
Percent Correct (%)                                       65.6          10.3
Total Time (ms)                                    1,143,796.1      74,010.6
Average Response Time (ms) (correct responses)         5,387.6       4,750.1

N = 1,992.


Factor Structure


The inter-item correlation matrix, presented in Table 6, indicates that there was a
speed/accuracy tradeoff. As subjects completed more trials, the proportion of correct responses
declined (r = -.65). On the other hand, subjects who spent more time on the test had a higher
proportion of correct responses on the trials they completed (r = .56).


Table 6. Dot Estimation: Inter-Item Correlation Matrix

Variable                                            1       2       3       4       5
1. Number of Trials Completed                     1.00
2. Number of Correct Responses                     .87    1.00
3. Percent Correct                                -.65    -.23    1.00
4. Total Time                                     -.74    -.58     .56    1.00
5. Average Response Time (correct responses)      -.92    -.83     .56     .87    1.00

N = 1,992.


The factor solution indicated one principal factor that accounted for 75.7% of the total item
variance. This suggested that the Dot Estimation test was unidimensional in nature. Results of
the factor analysis are presented in Table 7.


Table 7. Dot Estimation: Summary of Factor Analysis

Variable                                        Communality    Factor loadings I
Number of Trials Completed                          .97             -.98
Number of Correct Responses                         .60             [illegible]
Percent Correct                                     .33              .57
Total Time                                          .67              .82
Average Response Time (correct responses)           .99              .99

                        % of total    % of explained    Cumulative %
Factor    Eigenvalue    variance      variance          explained
I         3.57          75.7          100.0             100.0

N = 1,992.


Inferential  Measures 


A model that used the five Dot Estimation scores was not related significantly to either of
the UPT performance measures: UPT final outcome (R = .039, n.s.), ATRB rating (R = .121, n.s.).
A combined model that used the 16 AFOQT subtest scores along with the Dot Estimation scores was
related statistically to UPT final outcome (R = .287, p < .0001) and to advanced training
assignment (R = .292, p < .001). In both cases, the combined model failed to improve prediction
above that provided by the AFOQT scores alone at the .05 level of probability. A summary of the
Dot Estimation regression analyses is provided in Table 8.


Table 8. Dot Estimation: Summary of UPT
Outcome Regression Analyses

                                                  Correlation with predictor
                                                  UPT outcome      ATRB outcome
                                                  M = 0.66         M = 0.57
Predictor measure                                 (N = 812)        (N = 514)

Dot Estimation Variables
  Number of Trials Completed                      -.015            -.037
  Number of Correct Responses                     -.005            -.002
  Percent Correct                                  .025             .052
  Total Time                                       .012             .065
  Average Response Time (correct responses)        .020             .032

Multiple Correlation
  Dot Estimation                                   .039             .121
  16 AFOQT Subtests                                .285****         .273***
  Combined Model                                   .287****         .292***
  R Square Change                                  .001             .011

***p < .001.
****p < .0001.

Risk-Taking 


Descriptive  Measures 

The most conceptually interesting performance measures on this test were the average number
of boxes chosen (i.e., level of risk) and average response time on each trial. Table 9 summarizes
level of performance on the "risk" and "no-risk" trials.


Table 9. Risk-Taking: Means and Standard Deviations

                                   Number
Variable                           of trials        Mean          SD
Number of Boxes Chosen
  Risk                                18             4.5          0.8
  No Risk                             12             6.9          1.3
Average Response Time (ms)
  Risk                                18         2,663.3      1,675.6
  No Risk                             12         2,232.8      1,608.8

N = 1,992.


Performance on the no-risk trials suggested that these subjects, in general, applied a
somewhat risky strategy (average number of boxes chosen = 6.9). An "optimizing" strategy would
be to make five choices per trial to maximize rewards in the long term. Reliability estimates
were calculated separately for the 18 risk and 12 no-risk trials, as performance on the risk
trials was determined, in part, by chance. The number of boxes chosen was much less reliable for
the risk trials (Cronbach's alpha = .520) than for the no-risk trials (Cronbach's alpha = .954).
However, average response time per trial was reliable for both risk (Cronbach's alpha = .910) and
no-risk trials (Cronbach's alpha = .972).
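
The statement that five choices per trial is the optimizing strategy can be checked with a short
calculation. The sketch below is illustrative only and rests on assumptions not stated in this
paper: every reward box is taken to pay the same amount (one unit), the disaster box is taken to
be equally likely to be any of the ten boxes, and the subject is taken to commit in advance to
opening k boxes. The function name and constants are hypothetical.

```python
from fractions import Fraction

N_BOXES = 10  # ten boxes per trial, one of which is the "disaster" box


def expected_payoff(k, n_boxes=N_BOXES, reward=1):
    """Expected payoff for a plan to open k boxes (assumed equal rewards).

    The plan pays k * reward only if the disaster box is not among the
    k boxes opened, which happens with probability (n_boxes - k) / n_boxes.
    """
    p_safe = Fraction(n_boxes - k, n_boxes)
    return k * reward * p_safe


# Expected payoff for every possible plan, from 0 to 10 boxes.
payoffs = {k: expected_payoff(k) for k in range(N_BOXES + 1)}
best_k = max(payoffs, key=payoffs.get)

for k, value in payoffs.items():
    print(k, float(value))
print("best number of boxes:", best_k)
```

Under these assumptions the expected payoff is k(10 - k)/10, which peaks at k = 5, consistent
with the optimizing strategy described above.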


Factor  Structure 


The inter-item correlations, presented in Table 10, indicated that the two "riskiness"
measures (number of boxes chosen during risk and no-risk trials) were moderately correlated with
each other (r = .61) but not with average response time per trial (-.06 ≤ r ≤ .01). The two
average response time measures were related strongly to each other (r = .97).


Table 10. Risk-Taking: Inter-Item Correlation Matrix

                                                   Variable
Variable                                   1       2       3       4
Number of Boxes Chosen (risk)            1.00
Number of Boxes Chosen (no risk)          .61    1.00
Average Response Time (risk)             -.06     .00    1.00
Average Response Time (no risk)          -.05     .01     .97    1.00

N = 1,992.


As expected, the factor analysis yielded two factors; namely, response latency and
level of risk. The principal factor consisted of the two average response time variables and
accounted for 49.4% of the total item variance (61.6% of the "explained" variance). Both of the
number of boxes chosen variables loaded on the second factor, which accounted for 39.9% of the
total item variance (38.4% of the explained variance). The factor analysis is summarized in
Table 11.


Table 11. Risk-Taking: Summary of Factor Analysis

                                                       Factor loadings
Variable                              Communality        I         II
Number of Boxes Chosen (risk)             .61           -.04       .78
Number of Boxes Chosen (no risk)          .61            .02       .78
Average Response Time (risk)              .97            .98      -.03
Average Response Time (no risk)           .97            .98      -.01

                        % of total    % of explained    Cumulative %
Factor    Eigenvalue    variance      variance          explained
I         1.94          49.4          61.6              61.6
II        1.21          39.9          38.4              100.0

N = 1,992.
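
The communalities and eigenvalues in Table 11 can be related to the factor loadings with a short
calculation. The sketch below assumes orthogonal (uncorrelated) factors, which this paper does
not state explicitly; under that assumption a variable's communality is the sum of its squared
loadings, and a factor's eigenvalue is the sum of squared loadings in its column. Because the
published loadings are rounded to two decimals, the results only approximate the tabled values.

```python
# Loadings on factors I and II as printed in Table 11.
loadings = {
    "Number of Boxes Chosen (risk)": (-0.04, 0.78),
    "Number of Boxes Chosen (no risk)": (0.02, 0.78),
    "Average Response Time (risk)": (0.98, -0.03),
    "Average Response Time (no risk)": (0.98, -0.01),
}

# Communality: sum of squared loadings across factors, per variable.
communalities = {
    name: sum(value ** 2 for value in pair) for name, pair in loadings.items()
}

# Eigenvalue: sum of squared loadings across variables, per factor.
eigenvalues = [
    sum(pair[j] ** 2 for pair in loadings.values()) for j in range(2)
]

for name, h2 in communalities.items():
    print(f"{name}: communality ~ {h2:.2f}")
print("eigenvalues ~", [round(e, 2) for e in eigenvalues])
```

The recomputed eigenvalues (about 1.92 and 1.22) are close to the reported 1.94 and 1.21; the
small differences are what would be expected from rounding in the published loadings.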


Inferential  Measures 


As with the Dot Estimation model, performance measures from the Risk-Taking test demonstrated
poor predictive utility against UPT final outcome (R = .066, n.s.) and advanced training
assignment (R = .062, n.s.).

A combined model that used the Risk-Taking measures along with the 16 AFOQT subtest scores
was related significantly to UPT final outcome (R = .289, p < .0001) and advanced training
recommendation (R = .282, p < .01). As with Dot Estimation, the combined model did not improve
prediction above that provided by the AFOQT alone. The Risk-Taking regression analyses are
summarized in Table 12.


Table 12. Risk-Taking: Summary of UPT Outcome Regression Analyses

                                        Correlation with predictor
                                        UPT outcome      ATRB outcome
                                        M = 0.66         M = 0.57
Predictor measure                       (N = 812)        (N = 514)

Risk-Taking Variables
  Number of Boxes Chosen (risk)         -.053            -.024
  Number of Boxes Chosen (no risk)      -.029            -.013
  Average Response Time (risk)          -.029            -.009
  Average Response Time (no risk)       -.023            -.023

Multiple Correlation
  Risk-Taking                            .066             .062
  16 AFOQT Subtests                      .285****         .273***
  Combined Model                         .289****         .282**
  R Square Change                        .002             .005

**p < .01.
***p < .001.
****p < .0001.


Self-Crediting  Word  Knowledge 


Descriptive Measures

As previously mentioned, this test is essentially a vocabulary test designed to measure
self-assessment ability and self-confidence. Self-assessment was operationalized as the
difference between the subject's expectations (bet) and his/her actual performance (number
correct).

As shown in Table 13, subjects' actual performance (67.1% correct) far exceeded their
expectations (39.0% correct). Average response time for correct responses was 8.02 seconds. A
speed by accuracy interaction term was calculated by multiplying average response time by percent
correct and correcting for the means on those variables. As the interaction term is strongly
negative, it indicated that subjects who made more correct responses also responded more quickly
(i.e., subjects above the mean on one variable tended to be below the mean on the other variable).
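
The mean-corrected interaction term described above can be sketched in a few lines. The function
name and the three-subject data set below are hypothetical and serve only to illustrate the
calculation; they are not values from this study.

```python
from statistics import mean


def centered_interaction(response_times, percent_correct):
    """Return one speed-by-accuracy interaction score per subject.

    Each score is (subject's response time - grand mean response time)
    multiplied by (subject's percent correct - grand mean percent correct).
    """
    rt_mean = mean(response_times)
    pc_mean = mean(percent_correct)
    return [
        (rt - rt_mean) * (pc - pc_mean)
        for rt, pc in zip(response_times, percent_correct)
    ]


# Made-up example: faster subjects are also more accurate.
rts = [6000.0, 8000.0, 10000.0]  # average response time in ms
pcs = [80.0, 70.0, 60.0]         # percent correct
terms = centered_interaction(rts, pcs)
print(terms)
print("mean interaction:", mean(terms))
```

A negative average, as in this made-up example, arises when subjects above the mean on one
variable tend to fall below the mean on the other.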


Table 13. Self-Crediting Word Knowledge:
Means and Standard Deviations

Variable                                                  Mean            SD
Average Response Time (ms) (correct responses)         8,022.5       1,914.5
Percent Correct                                           67.1          10.5
Bet                                                       39.0          10.3
Average Response Time x Percent Correct               -3,555.3      24,830.7

Note. The Average Response Time x Percent Correct interaction term was
calculated by subtracting the grand mean from each subject's mean for the
two variables and then multiplying the two difference scores together
([subject's average response time - grand mean response time] x [subject's
percent correct - grand mean percent correct]).

N = 1,992.

Accuracy of response (Cronbach's alpha = .653) and average response time per trial
(Cronbach's alpha = .885) demonstrated acceptable reliability.
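
For readers unfamiliar with the reliability index reported here, the following is a minimal
sketch of the standard Cronbach's alpha formula, alpha = k/(k - 1) x (1 - sum of item variances /
variance of total scores). The function name and the small data set are hypothetical; the
trial-level data from this study are not reproduced in this paper.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a subjects-by-items score matrix.

    scores is a list of rows, one row per subject, each row holding one
    score per item (for example, 1 = correct and 0 = incorrect per trial).
    """
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up example: four subjects, three items scored 0/1.
example = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

Sample variances are used for both the items and the total score; using population variances
throughout would give the same alpha because the scaling factor cancels in the ratio.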


Factor  Structure 


A preliminary evaluation of the factor structure of this test resulted in five performance
variables. In addition to average response time, percent correct, bet, and the speed/accuracy
interaction term, a fifth variable, the difference between actual and expected performance
(percent correct minus bet), was calculated. The fifth variable was dropped, however, because it
was correlated too strongly with the other variables and resulted in a communality value equal
to 1.0.

The inter-item correlations, summarized in Table 14, indicated that the remaining variables
were not related strongly to each other. As expected, actual and expected performance were
moderately related (r = .33). Average response time was negatively related to actual (r = -.16)
and expected (r = -.21) performance. Subjects who were more self-confident (bet more) were more
accurate and responded more quickly than did subjects who were less self-confident (bet less).

Table 14. Self-Crediting Word Knowledge:
Inter-Item Correlation Matrix

                                                          Variable
Variable                                          1       2       3       4
Average Response Time (correct responses)       1.00
Percent Correct                                 -.16    1.00
Bet                                             -.21     .33    1.00
Average Response Time x Percent Correct         -.13    -.19     .00    1.00

N = 1,992.


The factor analysis produced two factors which together accounted for 65.6% of the total item
variance. The two "accuracy" scores (percent correct and bet) defined the principal factor,
while average response time and the speed/accuracy interaction term defined the second factor.

These two factors reflected the crucial components of this test; namely, accuracy/
self-confidence and response speed. Results of the factor analysis are summarized in Table 15.

Table 15. Self-Crediting Word Knowledge:
Summary of Factor Analysis

                                                              Factor loadings
Variable                                     Communality        I        II

Average Response Time (correct responses)        .21          -.39   [illegible]
Percent Correct                                  .49           .62       .32
Bet                                              .28           .53       .01
Average Response Time x Percent Correct          .32          -.01       .56

                        % of total    % of explained    Cumulative %
Factor    Eigenvalue     variance        variance         explained

I            0.85          36.9            65.6              65.6
II           0.45          28.7            34.4             100.0

N = 1,992.
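
The communalities in Table 15 can be related to the factor loadings. As a rough check, assuming the two factors are orthogonal (the source does not state the rotation), a variable's communality is the sum of its squared loadings. The Average Response Time row is left out because one of its loadings is not legible in the source.

```python
# Loadings on Factors I and II, and reported communalities, from Table 15.
loadings = {
    "Percent Correct": (0.62, 0.32),
    "Bet": (0.53, 0.01),
    "Average Response Time x Percent Correct": (-0.01, 0.56),
}
reported = {
    "Percent Correct": 0.49,
    "Bet": 0.28,
    "Average Response Time x Percent Correct": 0.32,
}

def communality(row):
    """Sum of squared loadings; valid under the orthogonal-factor assumption."""
    return sum(value ** 2 for value in row)

computed = {name: communality(row) for name, row in loadings.items()}
for name in loadings:
    print(f"{name}: computed {computed[name]:.3f}, reported {reported[name]:.2f}")
```

Because the published loadings are rounded to two decimals, the computed values agree with the reported communalities only to within about .01.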


Inferential  Measures 


The Self-Crediting Word Knowledge model was related statistically to UPT final outcome (R =
.157, p < .001) but not to advanced training recommendation (R = .036, n.s.). Contrary to
expectations, subjects who took longer to respond were more likely to pass UPT (r = .141,
p ≤ .001). Those who took longer to respond may have been showing caution rather than a lack of
confidence.

A combined model that used the 16 subtest scores from the AFOQT along with the scores from
the Self-Crediting Word Knowledge test was related statistically to UPT final outcome (R = .312,
p ≤ .0001), and significantly improved prediction above that provided by the 16 AFOQT subtests
alone (F[4,791] = 3.53, p ≤ .01). For the ATRB outcome, the combined model showed little
improvement over the AFOQT scores alone. Table 16 provides a summary of these regression
analyses.
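
The incremental F statistic reported above can be reproduced from the multiple correlations. The sketch below assumes the standard F test for a change in R squared between nested regression models and takes R = .285 for the 16 AFOQT subtests alone from Table 16; the function name `f_change` is illustrative.

```python
def f_change(r_full, r_reduced, df_num, df_den):
    """F statistic for the gain in R^2 when predictors are added to a model."""
    r2_full, r2_reduced = r_full ** 2, r_reduced ** 2
    return ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)

# Combined model (R = .312) versus the 16 AFOQT subtests alone (R = .285),
# with the degrees of freedom reported in the text: F[4,791].
r2_change = 0.312 ** 2 - 0.285 ** 2
f_upt = f_change(0.312, 0.285, df_num=4, df_den=791)
print(round(r2_change, 3), round(f_upt, 2))
```

The R squared change computed this way also matches the .016 entry in Table 16.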


Activities  Interest  Inventory 


Descriptive Measures


As with Risk-Taking, this test was designed to assess attitudes toward risk-taking. The
primary measure of interest was the number of high-risk activities chosen by each subject from
the activity pairs.

The average number of high-risk activities chosen was 49.6 out of 81 (61.2%). Average
response time per trial was 4.48 seconds. The number of high-risk activities chosen and average
response time were not statistically related to each other (r = -.07). The reliabilities of
response choice (Cronbach's alpha = .864) and response time (Cronbach's alpha = .954) were
acceptable. Table 17 presents the means and standard deviations for these measures. A factor
analysis was not performed because there were only two variables.


Inferential  Measures 

Scores on this test were not statistically related to UPT final outcome (R = .043, n.s.) or
advanced training recommendation (R = .061, n.s.).

Table 16. Self-Crediting Word Knowledge:
Summary of UPT Outcome Regression Analyses

                                              Correlation with predictor
                                              UPT outcome    ATRB outcome
                                               M = 0.66        M = 0.57
Predictor Measure                             (N = 812)       (N = 514)

Self-Crediting Word Knowledge Variables
  Average Response Time (correct responses)     .141***         -.026
  Percent Correct                              -.074             .026
  Bet                                          -.063             .019
  Average Response Time x Percent Correct       .029          [illegible]

Multiple Correlation
  Self-Crediting Word Knowledge                 .157***          .036
  16 AFOQT Subtests                             .285****         .273***
  Combined Model                                .312****         .278**

R Square Change                                 .016**           .003

Note. Significance levels (*) refer to the unique contribution of a
variable in the context of a reduced set of variables which themselves
contribute uniquely to the prediction of a criterion. Critical values for
zero-order correlations at the .05 level of significance are .069 for N = 800
(UPT) and .088 for N = 500 (ATRB).

**p ≤ .01.

***p ≤ .001.

****p < .0001.
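
The critical values for zero-order correlations quoted in the note to Table 16 can be approximated from the sample size alone. The sketch below assumes a two-tailed test at the .05 level and substitutes a normal quantile for the t quantile, which is a close approximation at these sample sizes; `critical_r` is an illustrative name.

```python
from statistics import NormalDist

def critical_r(n, alpha=0.05):
    """Approximate two-tailed critical value of Pearson's r for sample size n."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # normal stand-in for the t quantile
    df = n - 2
    return z / (z * z + df) ** 0.5

for n in (800, 500):
    print(n, round(critical_r(n), 4))
```

Under these assumptions the values come out at about .069 for N = 800 and about .087 for N = 500, the latter slightly below the tabled .088 because the normal quantile is marginally smaller than the corresponding t quantile.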


Table 17. Activities Interest Inventory:
Means and Standard Deviations

Variable                                       Mean         SD

Number of High-Risk Activities Chosen          49.6         9.9
Average Response Time per Trial (ms)        4,483.8     1,080.3

A combined model that used the 16 AFOQT subtest scores along with the Activities Interest
Inventory scores was related statistically to final training outcome (R = .291, p ≤ .0001) and to
advanced training recommendation (R = .276, p < .001) but did not improve prediction
significantly over a model that used only the AFOQT subtests. The Activities Interest Inventory
regression analyses are summarized in Table 18.


Table 18. Activities Interest Inventory:
Summary of UPT Outcome Regression Analyses

                                              Correlation with predictor
                                              UPT outcome    ATRB outcome
                                               M = 0.66        M = 0.57
Predictor Measure                             (N = 812)       (N = 514)

Activities Interest Inventory Variables
  Number of High-Risk Activities Chosen        -.020             .049
  Average Response Time                        -.036            -.042

Multiple Correlation
  Activities Interest Inventory                 .043             .061
  16 AFOQT Subtests                             .285****         .273***
  Combined Model                                .291****         .276***

R Square Change                                 .003             .002

***p < .001.

****p ≤ .0001.

Embedded Figures

Descriptive  Measures 

The most important performance measures on this test were accuracy of response and average
response time. Although overall accuracy of response was acceptable (65.5% correct), accuracy
fell below "chance level" (50%) on 11 of the 30 trials. Most of these trials exhibited low
correlations with the item-total score, suggesting that the stimuli used on these trials were
poor discriminators of performance and should be eliminated from this test. Despite this
problem, responses were fairly reliable (Cronbach's alpha = .702).

Average response time for correct responses was 12.2 seconds and was very reliable
(Cronbach's alpha = .915). These descriptive measures are summarized in Table 19.


Table 19. Embedded Figures:
Means and Standard Deviations

Variable                                              Mean          SD

Average Response Time (ms) (correct responses)    12,200.0     4,802.9
Percent Correct (%)                                   65.5        14.5

N = 1,992.


Additional details regarding the items of this test (e.g., item-total correlations, inter-item
correlations, and factor structure) are not discussed here but are available in an earlier paper
(Carretta, 1987). Carretta (1987) suggested that performance on the Embedded Figures test could
be summarized by three variables: average response time, accuracy of response, and a response
time by accuracy interaction term.


Inferential Measures

The Embedded Figures model (average response time, percent correct, and response time by
percent correct interaction term) demonstrated poor predictive utility against both of the UPT
performance criteria. The model was not statistically related to UPT pass/fail outcome (R =
.050, n.s.) or to advanced training recommendation (R = .089, n.s.).

When the Embedded Figures model was combined with the 16 AFOQT subtest scores, the combined
model was related statistically to both UPT final outcome (R = .296, p ≤ .0001) and advanced
training recommendation (R = .293, p < .001). The combined model, however, did not
significantly improve prediction of performance above that provided by the AFOQT scores alone.
Results from these regression analyses are presented in Table 20.


Table 20. Embedded Figures: Summary of UPT Outcome Regression Analyses

                                              Correlation with predictor
                                              UPT outcome    ATRB outcome
                                               M = 0.66        M = 0.57
Predictor Measure                             (N = 812)       (N = 514)

Embedded Figures Measures
  Average Response Time                        -.005          [illegible]
  Percent Correct                              -.046             .039
  Average Response Time x Percent Correct      -.016             .075

Multiple Correlation
  Embedded Figures                              .050             .089
  16 AFOQT Subtests                             .285****         .273***
  Combined Model                                .296****         .293***

R Square Change                                 .006             .011

***p < .001.

****p ≤ .0001.

Integrated  Model 


Factor  Structure 


A factor analysis was performed using the 18 variables from the five tests in order to
determine the relationships among them. An examination of the inter-item correlation matrix
presented in Table 21 reveals that there are few strong correlations between variables from
different tests. This suggests that there is little overlap among the tests in the
characteristics being measured. Although the variables within Dot Estimation and, to a lesser
extent, Risk-Taking demonstrated good internal consistency, those from the other three tests did
not. The Self-Crediting Word Knowledge, Activities Interest Inventory, and Embedded Figures
tests lacked clear factor relationships.

Table 21. BAT Personality/Attitudinal Tests: Inter-Item Correlation Matrix

The factor analysis, presented in Table 22, produced a six-factor solution that accounted for
65.1% of the total item variance. To simplify the table, only factor loadings with a magnitude
of .30 or higher are presented. The principal factor can be interpreted as "speededness" or
"compulsiveness," as the five variables from Dot Estimation were the only ones that loaded on
it. Factors II (response latency) and III (riskiness) were defined by variables from
Risk-Taking. Contrary to expectations, the Activities Interest Inventory variables did not
cluster with those from Risk-Taking, although both tests were designed to assess attitudes toward
risk-taking. The remaining three factors were uninterpretable, as each was defined by only two or
three variables and, as a result, lacked stability.

Table 22. BAT Personality/Attitudinal Tests: Summary of Factor Analysis

                                                    Factor loadings of .30 or higher
                                                    (in source order; factor columns
Variable                            Communality     not recoverable)

Dot Estimation
  N Trials Completed                    .94          -.96
  N Correct Responses                   .99          -.93    .36
  Total Time                            .70           .76    .30
  Average Response Time                 .97           .97
  Percent Correct                       .67           .57    .60

Risk-Taking
  N Boxes Chosen (risk)                 .57           .75
  N Boxes Chosen (no risk)              .64           .80
  Average Response Time (risk)          .95           .96
  Average Response Time (no risk)       .98           .98

Self-Crediting Word Knowledge
  Average Response Time                 .63           .79
  Percent Correct                       .61           .71
  RT by % Correct                       .09
  Bet                                   .26           .32

Activities Interest Inventory
  N High-Risk Choices                   .02
  Average Response Time                 .26           .49

Embedded Figures
  Average Response Time                 .09
  Percent Correct                       .05
  RT by % Correct                       .02

                        % of total    % of explained
Factor    Eigenvalue     variance        variance       Cumulative %

I            3.78          21.8            40.0             40.0
II           2.07          12.4            22.0             62.0
III          1.26           9.1            13.3             75.3
IV           1.04           8.4            11.0             86.3
V             .73           7.2             7.7             94.0
VI            .66           6.2             6.0            100.0

Note. Factor loadings less than .30 omitted.

N = 1,992.


The goal of this factor analysis was to identify the common and unique variance among the 18
variables from the five BAT tests, and to produce a minimum number of meaningful factor scores to
be used as predictors of flight training performance. However, because the factor solution was
not clear, the original 18 variables, rather than the factor scores, were used in an integrated
model.


Inferential  Measures 


A 34-predictor regression equation that used the 16 AFOQT subtest scores along with the 18
BAT variables was related significantly to UPT final outcome (R = .346, p ≤ .0001). This model
was compared to a reduced model that also was related significantly to UPT final outcome (AFOQT
subtests and Self-Crediting Word Knowledge scores, R = .312, p ≤ .0001). The two models did not
differ significantly in their predictive utilities (F[14,777] = 1.41, n.s.). That is, scores
from the Dot Estimation, Risk-Taking, Activities Interest Inventory, and Embedded Figures tests
did not significantly improve the prediction of successful completion of pilot training above
that provided by the AFOQT subtests and Self-Crediting Word Knowledge scores. The 34-predictor
AFOQT/5 BAT test model was related significantly to advanced training recommendation (R = .326,
p ≤ .01) but did not significantly improve prediction above that provided by the 16 AFOQT
subtests alone (F[18,499] = 0.98, n.s.).
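
The degrees of freedom in the UPT model comparison follow from the predictor counts and the sample size. The sketch below assumes the usual nested-model bookkeeping (numerator df = predictors added, denominator df = N minus the number of predictors in the full model minus 1) and uses N = 812 from the regression tables; the reduced model has 20 predictors (16 AFOQT subtests plus 4 Self-Crediting Word Knowledge variables). The helper `nested_f` is hypothetical.

```python
def nested_f(n, k_full, k_reduced, r_full, r_reduced):
    """Degrees of freedom and F for comparing a full model with a nested one."""
    df_num = k_full - k_reduced      # predictors added by the full model
    df_den = n - k_full - 1          # residual df of the full model
    gain = r_full ** 2 - r_reduced ** 2
    f = (gain / df_num) / ((1 - r_full ** 2) / df_den)
    return df_num, df_den, f

df_num, df_den, f_integrated = nested_f(
    n=812, k_full=34, k_reduced=20, r_full=0.346, r_reduced=0.312
)
print(df_num, df_den, round(f_integrated, 2))
```

Under these assumptions the result agrees with the F[14,777] = 1.41 reported for UPT final outcome.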


Summary 

The AFOQT subtest scores as a group demonstrated a moderately strong relationship with UPT
performance. It should be noted that the relative importance of the 16 subtests differed for the
two flying training outcome measures.

The five sets of personality measures from the BAT were sufficiently reliable to be used in
selection systems; however, none of the BAT tests was related statistically to both UPT final
outcome and advanced training recommendation. Performance on the Self-Crediting Word Knowledge
test was related to UPT final outcome. Subjects who took longer to respond (i.e., were more
cautious) were more likely to complete training successfully.


IV.  DISCUSSION 

There are several explanations for the poor predictive utility demonstrated by these
personality/attitudinal tests. One explanation is that the BAT tests may not be measuring the
characteristics they were designed to measure (i.e., poor construct validity). Although each
test was adapted from a previously validated paper-and-pencil test, no subjects were given both
the BAT and the paper-and-pencil versions of the tests. As a result, the BAT tests can be
evaluated in terms of face validity, but not construct validity.

Even if the BAT tests have acceptable construct validity, scores on them were not found to be
related strongly to pilot training performance. Subjects in this study may have been too similar
to one another in terms of the characteristics measured by these tests, or they may have been
faking their responses to present a positive image to others, or their "true" personalities may
not have emerged because of situational pressures. Another possible explanation is that a
"personality/attitudinal profile" that considered several characteristics together might be
related more closely to training performance than would any single characteristic alone.
Although the personality/attitudinal profile hypothesis was not supported by results from the
integrated model, this does not mean that personality and attitudes are not related to flying
training performance or that research with personality/attitudinal measures should be abandoned.
Recent efforts by Spence and others (e.g., Spence & Helmreich, 1983; Spence, Helmreich, &
Holahan, 1979) have yielded promising relationships among measures of interpersonal skills, need
for achievement, and pilot performance. In a research effort being sponsored by the National
Aeronautics and Space Administration and the US Navy, other personality attributes not considered
here are being evaluated, including measures of locus of responsibility (Reid & Ware, 1973),
instrumentality and interpersonal orientation (Spence et al., 1979), mastery and competitiveness
(Spence & Helmreich, 1983), and other personality factors (Dahlstrom, Welsh, & Dahlstrom, 1972).


V. CONCLUSIONS

Each of the five BAT tests included in this study exhibited acceptable reliability. However,
none of them was related statistically to both measures of flying training performance
(graduation/elimination, advanced training recommendation). Performance on the Self-Crediting
Word Knowledge test was related statistically to UPT final outcome.

As a result, it is suggested that only the Self-Crediting Word Knowledge test be retained in
the BAT battery. Future studies are planned to evaluate the construct validity of this test by
administering it with other measures of self-confidence and self-assessment.